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Abstract 

Hereditary Haemorrhagic Telangiectasia (HHT) is an autosomal dominantly inherited vascular disease characterized by the 
presence of mucocutaneous telangiectasia and arteriovenous malformations in visceral organs. HHT is predominantly 
caused by mutations in ENC and ACVRL 1, which both belong to the TGF-p signalling pathway. The exact mechanism of how 
haploinsufficiency of ENG and ACVRLl leads to HHT manifestations remains to be identified. As long non-coding RNAs 
(IncRNAs) are increasingly recognized as key regulators of gene expression and constitute a sizable fraction of the human 
transcriptome, we wanted to assess whether IncRNAs play a role in the molecular pathogenesis of HHT manifestations. By 
microarray technology, we profiled IncRNA transcripts from HHT nasal telangiectasial and non-telangiectasial tissue using a 
paired design. The microarray probes were annotated using the GENCODE v.16 dataset, identifying 4,810 probes mapping 
to 2,81 1 IncRNAs. Comparing HHT telangiectasial tissue with HHT non-telangiectasial tissue, we identified 42 IncRNAs that 
are differentially expressed {q<0.001). Using GREAT, a tool that assumes cis-regulation, we showed that differently 
expressed IncRNAs are enriched for genomic loci involved in key pathways concerning HHT. Our study identified IncRNAs 
that are aberrantly expressed in HHT telangiectasia and indicates that IncRNAs may contribute to regulate protein-coding 
loci in HHT. These results suggest that the IncRNA component of the transcriptome deserves more attention in HHT. A 
deeper understanding of IncRNAs and their role in telangiectasia formation possesses potential for discovering therapeutic 
targets in HHT. 
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Introduction 

Hereditary Haemorrhagic Telangiectasia (HHT) is an autoso- 
mal dominantly inherited vascular disease characterized by the 
presence of mucocutaneous telangiectasia and arteriovenous 
malformations (AVMs) in visceral organs, primarily the lungs, 
liver and brain. The most common clinical manifestation is 
spontaneous and recurrent epistaxis [1,2], originating from nasal 
telangiectasias, which can be difficult to prevent and can lead to 
severe anaemia. Moreover, pulmonary arteriovenous malforma- 
tions (PAVMs) are seen in approximately one third of patients and 
are potentially lethal, due to haemorrhage or shunting of blood 
through these abnormal blood vessels, which can cause paradox- 
ical embolisms and cerebral abscesses [2,3]. The clinical manifes- 
tations of HHT are extremely variable, even between family 
members, and age-dependent penetrance is present. The reported 
prevalence in Denmark is around 1 in 6,500 [4]. Around 85% of 
clinically diagnosed HHT patients [5-7] carry a mutation in either 
ENG (HHTl) or ACVRLl (HHT2). Patients with mutations in 
ENG or ACVRLl are clinically similar, as all reported manifesta- 
tions are known to occur in both. However, later onset and lower 
penetrance, less cerebral and pulmonary AVMs, but more liver 



involvement and a risk of developing pulmonary arterial 
hypertension, are observed in patients with ACVRLl mutations. 
Currently there is no cure for the disease; only symptomatic 
treatment is possible. 

HHT manifestations are thought to result from an imbalance in 
the process of angiogenesis. Angiogenesis is the development of 
new capillary blood vessels from pre-existing vessels and is 
controlled by different cytokines such as vascular endothehal 
growth factor (VEGF) and transforming growth factor beta 1 
(TGFfil). ENG and ACVRLl encode receptor proteins: Endoglin 
and ALKl (Activin A receptor type ITlike 1) (SKR3) respectively, 
which are components of the TGF-fi signalling pathway. The 
TGF-fi signalling pathway plays a complex and important role in 
development and homeostasis of many organs, including the 
vascular system. In the latter, it is involved in cell proliferation, 
migration, extracellular matrix formation, vascular smooth muscle 
cell difiFerentiation and vascular tone. As Endoglin and ALKl 
proteins are predominantly expressed in endothelial cells, these are 
primary cellular targets of the disease. Thus, HHT manifestations 
are caused by a disturbance in the TGF-fi signalling pathway 
[8- 1 0] . The exact mechanism of how haploinsufficiency of ENG 
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and ACVRLl leads to HHT manifestations remains to be 
identified. 

Our knowledge of the non protein-coding part of the human 
transcriptome is expanding rapidly, and recently attention has 
shifted towards the most numerous but still poorly understood 
group — the long non-coding RNAs (IncRNAs). Long non-coding 
RNAs (hicRNAs) are defined as eukaryote RNAs longer than 200 
nucleotides in length, without protein coding capacity. This 
arbitrary limit of 200 nucleotides distinguishes IncRNAs from 
small regulatory RNAs, such as microRNA and short interfering 
RNA (siRNA). Studies indicate that IncRNAs are generated 
through pathways similar to that of protein-coding genes, with 
similar histone-modification profiles, splicing signals, and exon/ 
intron lengths [11]. LncRNAs have been shown to be involved in 
major mechanisms of transcriptional and post-transcriptional 
regulation, such as targeting transcription factors, initiating 
chromatin remodelling, directing methylation complexes and 
blocking nearby transcription [12]. LncRNAs appear to control 
expression of protein-coding genes through both cis- and trans- 
acting pathways [1 1]. Moreover, IncRNAs tend to be expressed at 
lower levels than protein-coding genes and display more tissue- 
specific expression patterns [11]. 

Of the many identified IncRNAs, only few have been 
characterized functionally [13]. A complete catalogue of IncRNAs 
is not yet available; nonetheless, in 2012 the GENCODE 
consortium published the most complete set of human IncRNAs 
to date [11], comprising 9277 manually annotated genes 
producing 14,880 transcripts. GENCODE annotate IncRNAs 
according to their localization regarding the nearest known 
protein transcripts: exonic, intronic, overlapping or intergenic 
[II]. 

To identify potential HHT therapeutic targets, further knowl- 
edge on how a disturbance of the TGF-[3 signalling pathway leads 
to HHT manifestations is needed. As IncRNAs are increasingly 
recognized as key regulators of gene expression we assessed the 
IncRNA expression in HHT tissue. Thus, the purpose of this study 
was to investigate the possible involvement of IncRNAs in the 
molecular pathogenesis of the telangiectasia formation in HHT 
patients. This was done by expression profiling of IncRNAs in 40 
telangiectasial and 40 non-telangiectasial samples from HHT 
patients using microarray technology. 

Materials and Methods 

Ethics statement 

This study was approved by the ethics committee of Southern 
Denmark (S-20()90131), and the patients provided verbal and 
written informed consent to participate in this study in accordance 
with the Declaration of Helsinki. 

HHT patients 

Nasal mucosal biopsy specimens of telangiectasial and non- 
telangiectasial tissue in pairs from HHTl (n=19) and HHT2 
(n = 21) patients were used in this study. The nasal mucosa was 
anaesthetized with gauze impregnated with Lidocain phenyle- 
phrinehydrocloride. No infiltration anaesthesia was used. The 
samples were collected with a Weil Nasal Forceps by an 
experienced rhinologist and contained either macroscopicly visible 
telangiectasia or natural mucosa. All patients showed signs of 
HHT according to the Curasao Criteria [14] and carried the 
familial pathogenic mutation in either ENG or ACVRLl. The 80 
paired samples represented 14 different HHTl families and 14 
different HHT2 families. The familial mutations were of different 
types and distributed throughout the two genes (Table 1) [15]. The 



patients were aged 28-56 years (median age 40) and included 21 
women and 19 men. Patients were diagnosed and treated in the 
national HHT Centre of Denmark in Odense University Hospital. 

RNA isolation and quality 

Samples were collected in RNA/afer Solution (Life Technologies) 
and stored at — 20°C. Total RNA was isolated using the RNeasy 
Micro Kit according to the RNeasy® Micro Handbook (Qiagen). 
The quantit)' of RNA was assessed with the NanoDrop 
spectrophotometer ND-8000 (NanoDrop Technologies), and 
RNA quality was assessed by the Agilent 2100 Bioanalyzer 
(Agilent Technologies). 

Microarray labelling, hybridization, and scanning 

Sample labelling and array hybridization were performed 
according to the Two-Color Microarray-Based Gene Expression 
Analysis — Low Input Quick Amp Labeling — ^protocol (Agilent 
Technologies) using the SurePrint G3 Human Gene Expression 
8x60 microarray format (Agilent Technologies). Samples were 
labelled with Cy5 and Universal Human Reference RNA 
(Stratagene) was labelled with Cy3 and used as a common 
reference on all arrays. 

Data pre-processing and annotation of IncRNAs 

AgHent Feature Extraction software v. 10.7.3.1 (Agilent 
technologies) was used to analyse acquired array images. Data 
were then within-array normalized by Loess normalization 
method and between-array normalized by Qiaantile normaliza- 
tion. The normalized values were used to calculate log2 
transformed Cy5/Cy3 ratios. Replicate probes were collapsed 
calculating the median. Missing expression values were imputed 
by A:-nearest neighbours averaging (A;= 10). Date pre-processing 
was performed using the R (R Core team 2012) package limma 
[16]. Microarray data were deposited to the Gene Expression 
Omnibus (GSE53515). 

All the 42,164 probes of the Agilent SurePrint G3 array were re- 
annotated using GENCODE v. 16 gene annotation database 
(www.gencodegenes.org) [11]. The genomic coordinates of the 
probes in the Agilent array were matched to the genomic 
coordinates of the IncRNAs from the GENCODE v.l6, to identify 
the probes covering IncRNAs. LncRNAs were included if 55 base 
pairs overlapped with the 60-mer array probes. 

Data analysis 

Data analyses were performed using the Qlucore Omics 
Explorer 2.3 software I'Qlucore). Differentially expressed IncRNAs 
comparing telangiectasial and non-telangiertasial tissue were 
ranked according to statistical significance determined by two- 
group comparison (paired t-test). This was done for the groups 
HHTl and HHT2 seperately and subsequently for the total group 
of HHT. Multiple testing was adjusted for by the Benjamini- 
Hochberg method. Differentially expressed IncRNAs were chosen 
for further evaluation (q<0.15). Principal component analysis 
(PCA) and hierarchical clustering were performed in Qlucore 
Omics Explorer 2.3 to examine whether telangiectasial and non- 
telangiectasial samples could be separated. 

Genomic Regions Enrichment of Annotation 

To assess the potential ctj-regulatory effect of the cUfferentially 
expressed IncRNAs in HHT, we used the Genomic Regions 
Enrichment of Annotations Tool (GREAT) [1 7] . This program 
has genomic coordinates as inputs and outputs nearby genes and 
their ontologies. It analyses the functional significance of cis- 
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Table 1. Genotypes and phenotypes of the HHT patients who participated in this study. 



Gene 


Family No. 


Nucleotid change 


Amino acid change 


Age 


Sex 


Phenotype 


ENG 


9 


C.360OA 


p.Tyrl20* 


49 


M 


E,T,P,G,F 


ENG 


9 


C.360OA 


p.Tyrl20* 


53 


p 


E.T.F 


ENG 


13 


C.360OA 


p.Tyrl20* 


54 


p 


E,T,P,G,F 


ENG 


13 


C.360OA 


p.Tyrl20* 


30 


p 


E,T,P,F 


ENG 


15 


C.361-2A>G 


p.? 


47 


p 


E,T,G,F 


ENG 


15 


C.361-2A>G 


P? 


38 


M 


E,T,P,F 


ENG 


15 


c.361-2A>G 


p.? 


40 


p 


E,T,F 


ENG 


24 


C.360OA 


p.Tyrl20* 


40 


p 


E,T,F 


ENG 


24 


C.360OA 


p.Tyrl20* 


46 


p 


E,T,P,G,F 


ENG 


37 


C.360OA 


p.Tyrl20» 


51 


M 


E,T,P,F 


ENG 


38 


C.360OA 


p.Tyrl20* 


37 


p 


E,T,F 


ENG 


49 


C.8210T 


p.Thr274lle 


47 


p 


E,T,F 


ENG 


61 


c.817-3T>G 


P? 


49 


p 


E,T,F 


ENG 


65 


c.ll66_1168delTCT 


p.Phe389del 


35 


p 


T,P,F 


ENG 


87 


C.808OT 


p.Gln270* 


38 


M 


T,P,F 


ENG 


92 


c.219+lG>T 


P? 


44 


M 


E,T,P 


ENG 


93 


c.l550_1551delTG 


p.Val517Glufs«10 


37 


M 


E,T,F 


ENG 


94 


p.l582_1583clel 


p.Pro528Alafs*38 


30 


M 


E,T,F 


ENG 


96 


C.2770T 


p.Arg93* 


46 


M 


E,T,P,F 


ACVRLl 


8 


C.1120OT 


p.Arg374Trp 


55 


M 


E,T,F 


ACVRLl 


8 


C.1120OT 


p.Arg374Trp 


28 


M 


E,T,F 


ACVRLl 


18 


C.14680T 


p.Gln490» 


43 


M 


E,T,F 


ACVRLl 


18 


C.14680T 


p.Gln490* 


44 


F 


E,T,P,F 


ACVRLl 


18 


C.14680T 


p.Gln490* 


41 


M 


E,T,F 


ACVRLl 


20 


C.430OT 


p.Argl44» 


47 


M 


E,T 


ACVRLl 


42 


c.1135G>A 


p.Glu379Lys 


32 


F 


E,T,F 


ACVRLl 


43 


C.626-30G 


p.? 


38 


F 


E,T,F 


ACVRLl 


43 


C.626-30G 


P? 


36 


F 


T,F 


ACVRLl 


46 


C.1013T>A 


p.Val338Asp 


39 


M 


E,T,F 


ACVRLl 


47 


C.1120OT 


p.Arg374Trp 


40 


F 


E,T,F 


ACVRLl 


56 


c.l-?_1048+?del 


p.O? 


44 


F 


E,T,F 


ACVRLl 


57 


C.143G>A 


p.Gly48Glu 


34 


M 


E,T,F 


ACVRLl 


57 


c.l43G>A 


p.Gly48Glu 


32 


F 


T,F 


ACVRLl 


67 


c.266G>A 


p.Cys89Tyr 


24 


F 


E,T,F 


ACVRLl 


67 


c.266G>A 


p.Cys89Tyr 


31 


M 


E,T,F 


ACVRLl 


69 


c.1232G>A 


p.Arg41 IGIn 


39 


M 


E,T,F 


ACVRLl 


71 


C.143G>A 


p.Gly48Glu 


44 


F 


E,T,F 


ACVRLl 


82 


c.139_140insCG 


p.Arg47Profs*8 


48 


M 


E,T,F 


ACVRLl 


88 


C.155del 


p.Thr52Lysfs'2 


56 


F 


E,T,G,F 


ACVRLl 


88 


c.l55del 


p.Thr52Lysfs*2 


50 


M 


E,T,F 



Abbreviations: AVM, arteriovenous malformation; E, epistaxis; T, telangiectasia; 
bleeding; H, Hepatic AVM; F, family history. M, male; F, female. 
doi:l 0.1 371 /journal.pone.0090272.t001 



P, pulmonary AVM; C, cerebral AVM; G, gastrointestinal telangiectasia/gastrointestinal 



regulatory regions identified by localized measurements of DNA 
binding events across an entire genome. 

Correlation analysis 

To identify potential frara-regulated transcripts, we assessed the 
correlation between the expression levels of the 1 0 highest ranking 
statistically significantly differentially expressed IncRNAs (HHT 



total) and every transcript present on the Agilent array by 
calculating Pearson's correlation coefficient. The transcripts were 
ranked according to the corrected p-values for the correlation 
coefficients; this was performed for each IncRNA of interest. 
Correction for multiple testing was performed using the Bonferoni 
correction. Manhattan plots were drawn using the Integrative 
Genome Viewer (IGV) version 2.3.18 [18], based on each of these 
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Table 2. Top 42 long non-coding RNAs (q<0.001) differentially expressed in HHT telangiectasia. 





Ensembl gene ID 


gencode.vl e_GenomicCoordinates 


q-value 
(FDR) 


Fold 
change 


Gene blotype 


HGNC symbol 


ENSG00000249772.1 


chr5:80409204-8041 0671_R 


1 .83E-06 


0.86 


antisense 


- 


ENSG00000230544. 1 


chr13:1 14586640-1 14588308_F 


1 .50E-05 


0.88 


lincRNA 


LINC00453 


ENSG0000021 5231 .3 


chr5:5034472-50701 17_F 


1 .50E-05 


0.88 


lincRNA 




ENSG00000237548. 1 


chr9:1 2464691 5-1 24725998_R 


1 .50E-05 


0.86 


processed_transcript 


TTLL11-IT1 


ENSG00000263753.1 


chrl 8:5232875-5246507_F 


0.0001 1 


0.90 


lincRNA 


LINC00667 


ENSG00000231 133.1 


chr20:61 7271 50-61 733631_R 


0.00014 


0.90 


processed_transcript 


HARIB 


ENSG0000025621 8.1 


chr12:5475214-5476940_R 


0.00015 


0.90 


lincRNA 




ENSG00000241269.1, 
ENSGOOOOOl 88365.3 


chr7:5459458-5462753_F 


0.00022 


0.85 


antisense 




ENSG00000226496.1 


chr21 :4251 3427-42520060_R 


0.00023 


1.14 


antisense 


L1NC00323 


ENSG00000248176.1 


chr4:291 1 9930-29204392_F 


0.00039 


0.89 


lincRNA 




EN5G00000259484.1 


chrl 5:571 51866-5721 0697_R 


0.00039 


0.86 


processed_transcript 




ENSG00000206129.3 


chrl 8:53670844-53858493_R 


0.00042 


1.11 


lincRNA 




ENSG00000235285.1 


chrl 3:44720606-44732358_R 


0.00042 


0.90 


sense_intronic 


SMIM2-IT1 


ENSG00000147753.5 


chrY:6317509-6325947_F 


0.00042 


1.13 


lincRNA 


"nTY7 


ENSG00000240453.1 


chr1:745489-753092_R 


0.00042 


0.84 


processed_transcript 




ENSG00000259150.1 


chrl 5:26360960-263781 84_F 


0.00042 


1.09 


lincRNA 


LINC00929 


ENSG00000255471.1* 


chrl 1 :86603256-86636079_R 


0.00045 


0.81 


antisense 




ENSG00000237036.3 


chrl 0:31 596646-31 60881 0_R 


0.00056 


1.11 


antisense 




ENSG00000233 154.1 


chrl :1 1 6966346-1 1 7021 464_R 


0.00065 


0.91 


lincRNA 


- 


ENSGOOOOOl 97251. 3 


chr6:33553883-33561 1 1 5_R 


0.00068 


1.10 


antisense 


LINC00336 


ENSG00000248176.1 


chr4:291 1 9930-29204392_F 


0.00073 


0.92 


lincRNA 




ENSG000002541 54.3 


chrl :1 77897923-1 780071 42_R 


0.00073 


0.88 


processed_transcript 




ENSG0000021 5374.4 


chr8:71 591 33-721 2876_R 


0.00073 


0.90 


processed_transcript 




ENSGOOOOOl 95096.3 


chr2:21 41 41 276-21 41 48929_R 


0.00073 


0.92 


processed_transcript 




ENSG00000259758. 1 


chr8:l 41 530255-1 41 539600_R 


0.00073 


1.13 


lincRNA 


CASC7 


ENSG00000264772. 1 , 
ENSG00000265500.1 


chrl 7:7466604-7485342_F 


0.00074 


1.26 


processed_transcript 




ENSG00000229563.1 


chrX:45364633-45489447_F 


0.00076 


1.10 


processed_transcript 




ENSG00000203325.3 


chrl :3251 7892-32539075_R 


0.00076 


0.88 


antisense 




ENSG00000232956.3* 


chr7:45022622-45026560_R 


0.00076 


1.20 


lincRNA 




ENSG00000232021.2* 


chr4:1 09088681-1 091 77992_F 


0.00076 


0.90 


processed_transcript 


LEF1-AS1 


ENSG00000250195.1 


chr4:1 39741 1 08-1 39933800_R 


0.00076 


1.21 


antisense 




ENSG00000250608.1 


chr3:1 3 1 043936-1 3 1 1 003 1 9_R 


0.00076 


1.11 


processed_transcript 




ENSG00000266952.1 


chrl 8:61 88031 7-61 92729C_R 


0.00078 


0.91 


lincRNA 




ENSG00000259334.1 


chr14:24391456-24403777_R 


0.00079 


1.11 


lincRNA 


LINC00596 


ENSG00000249364.1 


chr5:66675206-671 01 066_F 


0.00079 


0.91 


processed_transcript 


- 


ENSG00000231 185.2* 


chr5:141 704858-142051 566_F 


0.00086 


0.93 


processed_transcript 




ENSG00000232046.1 


chr2:66801 1 62-66957289_F 


0.00086 


0.91 


processed_transcript 




ENSG00000230133.1 


chr20:241 80403-24205224_F 


0.00086 


0.90 


lincRNA 




ENSG0000021 5808.2 


chrl :238643684-238649323_R 


0.00090 


1.11 


processed_transcript 




ENSGOOOOOl 35253.9 


chr7:1 28502505-1 28550773_R 


0.00094 


1.08 


processed_transcript 


KCP 


ENSG00000233251.3 


chr2:56400669-5641 2905_R 


0.00094 


1.09 


processed_transcript 




ENSG00000245910.3 


chr8:6783391 9-67838633_R 


0.0010 


0.85 


processed_transcript 


SNHG6 



*Are part of the IncRNAs identified by GREAT analysis. 
doi:1 0.1 371 /journal.pone.0090272.t002 
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HHT (617) 



472 



92 



HHT2 (114) 



47 



16 



HHTl (70) 



17 



q<0.05 



Figure 1. Venn diagram. The number of statistically significantly 
differentially expressed long non-coding RNAs in the three groups 
(q<0.05). The numbers in the overlapped areas indicate the IncRNAs 
that are differentially expressed in the multiple groups. 
doi:1 0.1 371/journal.pone.0090272.g001 

correlation-sets for individual statistically significantly difiFerentially 
expressed IncRNAs. 

Results 

Annotation of IncRNAs 

AH the 42, 164 probes of the Agilent SurePrint G3 array were re- 
annotated using GENCODE v. 16 gene annotation database 
(www.gencodegenes.org)[l 1]. We identified 5,069 probes mapping 
to 2,939 IncRNAs. These 5,069 probes included both intronic and 
exonic IncRNA probes. Of these 5,069 probes, 404 mapped to 
overlapping IncRNAs and mRNAs - of these, 259 probes were 
excluded as they mapped to overlapping mRNA exons and 
IncRNA introns - only probes mapping to overlapping mRNA 
exons and IncRNA exons were included. Further analysis was 
performed using the 4,8 1 0 remaining IncRNA probes, mapping to 
2,81 1 IncRNAs, covering almost 31% of the GENCODE IncRNA 
dataset (Figure SI). 



Identification of differentially expressed IncRNAs in HHT 
telangiectasia 

We detected 6 1 7 statistically significandy differentially expressed 
IncRNAs (q<0.05, paired t-rest) when comparing telangiectasial 
and non-telangiectasial samples in the total HHT group (80 paired 
samples), of which 42 were highly statistically significant 
(q<0.001). Of the 42 differentially expressed IncRNAs (listed in 
Table 2), 16 were upregulated and 26 downregulated. None of 
these 42 IncRNAs are characterized functionally. 

Analysis of the HHTl samples resulted in 70 differentially 
expressed IncRNAs (q<0.05, paired t-test) when comparing 
telangiectasial and non-telangiectasial samples, and in HHT2 
analysis resulted in 1 14 differentially expressed IncRNAs (q<0.05, 
paired t-test). The number of common IncRNAs (q<0.05) in the 
three groups is rather small as shown in Figure 1. 

The complete results of the paired t-tests in HHT and the 
subgroups HHTl and HHT2 are provided in Table SI, S2, and 
S3. 

Principal component analysis (PCA) and hierarchical 
clustering 

PCA applied to the difiFerentially expressed IncRNAs (q<0.15) 
revealed a clear separation of the telangiectasial and non- 
telangiectasial samples, regarding HHTl (Figure 2a), HHT2 
(Figure 2b) and HHT (Figure 2c). No statistically significant 
difference in expression values between HHTl telangiectasial 
tissue and HHT2 telangiectasial tissue was observed. Even so, the 
data suggest that there is a minor difference as the comparison and 
grouping of the subgroups introduce more variation in the PCA 
plots (Figures 2a and 2b compared with Figure 2c). 

Hierarchical clustering: amongst the differentially expressed 
IncRNAs (q<0.001) in the HHT group both up- and down- 
regulation was observed (Figure 3). Hierarchical clustering for 
HHTl and HHT2 is shown in Figure S6 and S7. 

Genomic Regions Enrichment of Annotation 

HHTl. The differentially expressed IncRNAs with q<0.15 
(n = 617) were loaded into GREAT. Of these, 31% map within 50 
kHobases (kb) of an annotated gene, and 89% map within 500 kb 
of a gene (Figure S2). Gene ontology [19] (www.geneontology.org) 
analysis, performed in GREAT, of the 6 1 7 neighbouring genomic 






□ HHTl non-telangiectasia 
■ HHTl telangiectasia 



□ HHT2 non-telangiedasla 
■ HHT2 telangiectasia 



I HHT non-telangiectasia 
] HHT telangiectasia 



Figure 2. Principal component analysis (PCA) of the three groups, a. PCA applied to the differentially expressed long non-coding RNAs 
(IncRNAsXpaired t-test, q<0.15) revealed a clear separation of the telangiectasial and non-telangiectasial samples, regarding HHTl. b. PCA applied to 
the differentially expressed IncRNAs (paired t-test, q<0.15) revealed a clear separation of the telangiectasial and non-telangiectasial samples, 
regarding HHT2. c. PCA applied to the differentially expressed IncRNAs (paired t-test, q<0.15) revealed a clear separation of the telangiectasial and 
non-telangiectasial samples in the total group of HHT. There is no clustering into subgroups. 
doi:1 0.1 371 /journal.pone.0090272.g002 
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Figure 3. Hierarchical clustering HHT. Hierarchical clustering for the 80 HHT paired samples comparing the expression in telangiectasial (yellow) 
and non-telangiectasial (pink) tissue, using the top variant IncRNAs (n = 42)(q<0.001). In the heat map rows correspond to long non-coding RNAs 
(IncRNAs) and columns to samples. Red indicates elevated expression, green indicates reduced expression. In the 42 differentially expressed IncRNAs; 
16 are up-regulated and 26 are down-regulated. 
doi:1 0.1 371 /journal.pone.0090272.g003 



regions (826 genes) is seen in Table 3. The ontologies of those loci 
concerned six gene ontology (GO) terms (binomial false discovery 
rate (FDR) q-value<0.001), of which three were considered key 
terms as they are central to HHT pathogenesis: blood vessel 
morphogenesis (GO:0001568, 3747 gene products), blood vessel 
development (00:0001568, 3747 gene products) and vasculogen- 
esis (Synonym: Vascular morphogenesis, GO:0001570, 557 gene 
products). Table 4 (and Tables S4— S6 in File SI) lists the genes 
involved in the three GO terms and their relative positions. In the 



GO terms tree (Figure 4) the close connection and overlap of the 
three mentioned GO terms can be seen. 

HHT2. The differentially expressed IncRNAs with q<0.15 
(n = 640) were loaded into GREAT. Of these, 31% map within 50 
kHobases (kb) of an annotated gene, and 89% map within 500 kb 
of a gene (Figure S3). Gene ontology analysis of the 640 
neighbouring genomic regions (887 genes) is seen in Table 5. 
The ontologies of those loci concerned five GO terms (binomial 
FDR q-value<0.001), of which especially vasculogenesis was 



Table 3. Gene ontology (GO) analysis of 61 7 genomic regions (826 genes) located nearby differentially expressed long non-coding 
RNAs in the HHTl group. 





Ontology 


Term name 


Term ID 


BInom FDR 
Q-value 


BInom fold 
enrichment 


Hyper FDR 
Q-value 


Hyper fold 
enrichment 


GO Biological Process 


blood vessel morphogenesis 


GO:0048514 


9.21 e-7 


2.29 


4.59602e-5 


2.72 


GO Biological Process 


blood vessel development 


GO:0001568 


6.81 e-6 


2.07 


5.89303e-5 


2.52 


GO Biological Process 


immune system development 


GO:0002520 


7.30e-5 


2.06 


5.60561 e-3 


2.06 


GO Biological Process 


hemopoietic or lymphoid organ 
development 


GO:0048534 


1 .74e-4 


2.04 


1.14257e-2 


2.04 


GO Biological Process 


hemopoiesis 


GO:0030097 


4.03e-4 


2.06 


1.65733e-2 


2.06 


GO Biological Process 


vasculogenesis 


GO:0001570 


6.00e-4 


3.43 


5.61511e-4 


4.90 



doi:1 0.1 371 /journal.pone.0090272.t003 



PLOS ONE I www.plosone.org 6 March 2014 | Volume 9 | Issue 3 | e90272 



Long Non-Coding RNA and HHT 



Table 4. Overia 


p between gene-genomic regions association tables for the different GO terms. 








GO Blood vessel 

morphogenesis 

(HHT1) 


GO Blood vessel 

development 

(HHT1) 


GO GO GO GO Blood vessel 
Vasculogenesis Vasculogenesis Vasculogenesis morphogenesis 
(HHT1) (HHT2) (HHT) (HHT) 


Gene located on 
Chromosome 


Foldchange 
(ENSG) 


AMOT 


AMOT 


moT 


2 


1.09 






ANGPT2 


8 


1.13 


APOB 


APOB 


APOB 


14 


1.06-1.07 


BMP4 


BMP4 


BMP4 


14 


0.92-0.93 


CAVl 


CAVl 


CAVl CAVl CAVl CAVl 


7 


1.01-1.18 


CCM2 


CCM2 


CCM2 CCM2 CCM2 CCM2 


7 


1.16-1.25 




CDH5 




16 


0.89 






aTED2 CITED2 aTED2 


6 


0.92-0.93 


CXCLU 


CXCLU 


CXCLU 


10 


0.89-0.92 


CXCR4 


CXCR4 




2 


1.10 


CYPIBI 


CYPIBI 


CYPIBI 


2 


0.89-0.90 


FGFl 


FGFl 


FGFl 


5 


0.09-1.16 


FNl 


FNl 


FNl 


2 


0.88-1.22 


FOXF1 


FOXFl 


FOXFl FOXFl FOXFl FOXFl 


16 


1.06-1.09 




FOXOl 




13 


1.11 


FZD4 


FZD4 


FZD4 FZD4 FZD4 FZD4 


11 


0.65-1.22 


GREMl 


GREMl 




15 


1.09 


HDAC7 


HDAC7 


HDAC7 HDAC7 HDAC7 


12 


0.91-0.92 


HESl 


HESl 


HESl 


3 


0.91-1.14 






ITGBl 


10 


1.09 


ITGA5 


ITGA5 




12 


1.19-1.20 


JAG1 


JAGl 




20 


1.27 


JAM3 


JAMS 




n 


0.95 


LEFl 


LEFl 


LEFl 


4 


0.87-0.90 




MEF2C 




5 


1.09-1.11 


MEISl 


MEISl 


MEISl 


2 


0.57-1.14 


N0X5 


N0X5 




15 


0.93 






NR2F2 


15 


0.92 






NRPl 


10 


1.09 


NRXN1 


NRXNl 


NRXNl 


2 


0.94-0.95 


PRKX 


PRKX 




X 


1.08 


PR0K2 


PR0K2 


PR0K2 


3 


1.08-1.20 


PRSS23 


PRSS23 


PRSS23 PRSS23 PRSS23 PRSS23 


11 


0.65-0.93 


RASA1 


RASAl 


RASAl RASAl RASAl RASAl 


5 


0.92-1.16 






SHH SHH SHH 


7 


0.88-1.09 


S0X4 


S0X4 


S0X4 


6 


0.92-1.12 


STAB2 


STAB2 


STA82 


12 


0.92-0.94 


T 


T 


T T 


6 


1.16-1.28 


TGFB2 


TGFB2 


TGFB2 


1 


1.08-1.13 


THBSl 


THBSl 


THBSl 


15 


0.93-1.17 


TIPARP 


TIPARP 


TIPARP TIPARP TIPARP TIPARP 


3 


0.84-1.08 






VEGFA VEGFA VEGFA 


6 


1.07-1.24 


WARS 


WARS 




14 


0.92 




WNT2 




7 


0.93 


WNT7A 


WNT7A 


WNT7A WNT7A WNT7A 


3 


0.91-1.10 






WNT7B 


22 


0.90 


ZC3H12A 


ZC3H12A 




1 


0.92 


ZFPM2 


ZFPM2 


ZFPM2 ZFPM2 ZFPM2 ZFPM2 


8 


1.13-1.23 


ZMIZl 


ZMIZl 


ZMIZl ZMIZl ZMIZl ZMIZl 


10 


0.01-1.11 
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q-values can be seen of tables S4-S9. Genes that are present for all six GO-terms are highlighted in bold. 
doi:l 0.1 371 /journal.pone.0090272.t004 



GO:0008150 




all 


biologlcal_process 




all 



GO:0032501 
multicellular 
organismal process 




GO: 0044699 
single-organism 
process 



GO:0009987 
cellular process 



\ / 



00:0032502 
developmental process 



GO:0044707 
single-multicellular 
organism process 



\ / 



GO:0009653 
anatomical structure 
morphogenesis 




GO:0044763 
single-organism 
cellular process 



GO:0048869 
cellular 
developmental process 



GO:0030154 
cell differentiation 



GO:0044767 
single-organism 
developmental process 



GO:0048514 
blood vessel 
morphogenesis 



GO:0001570 
vasculogenesis 
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Figure 4. Gene Ontology terms tree. Graphical view of the gene ontology (GO) terms tree showing part of the GO terms in the ontology 
biological process and their connections. Our three GO terms; blood vessel development, blood vessel development, and vasculogenesis (in red 
boxes), are the more narrow and specific terms in the tree and overlap by a number of genes. 
doi:1 0.1 371 /journal.pone.0090272.g004 



Table 5. Gene ontology (GO) analysis of 640 genonnic regions (887 genes) located nearby differentially expressed long non-coding 
RNAs in the HHT2 group. 




Ontology 


Term name 


Term ID 


Binom FDR Q- 
value 


Binom fold 
enrichment 


Hyper FDR Q- 
value 


Hyper fold 
enrichment 


GO Biological 
Process 


vasculogenesis 


GO:0001570 


1 .90e-5 


3.80 


6.20e-5 


4.91 


GO Biological 
Process 


positive regulation of neuron 
differentiation 


GO:0045666 


4.95e-5 


3.13 


4.51 e-4 


4.41 


GO Biological 
Process 


regulation of mesenchymal cell 
proliferation 


GO:0010464 


3.90e-4 


3.28 


3.45e-3 


5.00 


GO Biological 
Process 


regulation of DNA binding 


GO:0051101 


7.35e-4 


3.24 


2.21 e-5 


5.00 


GO Biological 
Process 


regulation of binding 


GO:0051098 


8.32e-4 


2.43 


1.13e-5 


3.38 
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interesting. Table 4 (and Table S7 in File SI) lists the genes 
involved in this GO term and their relative positions. 

HHT. The dilTerentially expressed IncRNAs with q<0.05 
(n = 6 1 7) were loaded into GREAT. A more stringent cut-ofT of 
q<0.05 was applied here, as the sample size was larger. Of these 
IncRNAs, 31% map within 50 kilobases (kb) of an annotated gene, 
and 90% map within 500 kb of a gene (Figure S4). Gene ontology 
analysis of the 617 neighbouring genomic regions (837 genes) is 
seen in Table 6. Applying a cut-off of q<0.001 resulted in three 
GO terms: vasculogenesis, lactation and blood vessel morphogen- 
esis. Table 4 (and Tables S8 and S9 in File SI) lists the genes 
involved in the GO terms 'vasculogenesis' and 'blood vessel 
morphogenesis' and their relative positions. 

By GREAT analysis, 63 IncRNAs were identified, which is 
equal to 10.2% of the differentially expressed IncRNAs in the 
HHT group (n=:617, q<0.05). Concordingly, 89.8% of statisti- 
cally significandy differentially expressed IncRNAs in the HHT 
group are not part of the three selected statistically significant GO 
terms in GREAT. 

Correlation analysis 

To identify potential /raro-regulated transcripts, we assessed the 
correlation between the expression levels of the 10 highest ranking 
differentially expressed IncRNAs in the Hereditary Haemorrhagic 
Telangiectasia (HHT) total group and all transcripts present on 
the Agilent array, by calculating Person's correlation coefficient. 



The Manhattan plots drawn on this basis showed that several 
statistically significant correlations were present for each IncRNA 
in question (40-3725 correlations, p<0.05 (Bonferoni-corrected)(- 
data not shown)), and that the correlating transcripts were 
mapping to many different chromosomes (Figure 5 shows the 
top three ranking IncRNAs). More than half the correlated 
transcripts were IncRNAs (data not shown). In the cases of 
multiple IncRNAs neighbouring a gene you see a clear correlation 
of foldchange /direction; meaning that if one IncRNA neighbour- 
ing a gene is upregulated, in most cases (37/49) so are the other(s) 
(Table 4). 

Discussion 

To our knowledge, this is the first study to assess the regulatory 
effects of long noncoding RNAs (IncRNAs) in HHT affected tissue. 
Using microarray technology, we identified IncRNAs that are 
statistically significantly difierentiaUy expressed in HHT telangiec- 
tasial tissue compared with HHT non-telangiectasial nasal tissue. 
Using GREAT, a tool which assumes CM-regulation, we also 
showed that those IncRNAs are enriched for genomic loci involved 
in key pathways concerning HHT. The comparison of expression 
profiles between HHT telangiectasia! and HHT non-telangiecta- 
sial nasal tissue clearly showed the diflferential expression of a 
substantial number of IncRNAs. Principal component analysis 
(PCA) and hierarchical clustering were able to discriminate 
between HHT telangiectasia! and HHT non-telangiectasial nasal 



Table 6. Gene ontology (GO) analysis of 61 7 genonnic regions (837 genes) located nearby differentially expressed long non-coding 
RNAs in the total HHT group. 











Binom FDR Q- 


Binom fold 


Hyper FDR Q- 


Hyper fold 


Ontology Term name 


Term ID 


value 


enrichment 


value 


enrichment 


GO biological process vasculogenesis 


GO:0001570 


1.71 e-5 


3.94 


4.92e-5 


5.21 


GO biological process lactation 


GO:0007595 


2.06e-5 


6.51 


9.57e-3 


4.85 


GO biological process blood vessel morphogenesis 


GO:0048514 


1 .92e-4 


2.01 


4.06e-4 


2.40 
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Figure 5. Manhattan plots. Manhattan plots showing significance of correlation between the three top statistically significantly differentially 
expressed long non-coding RNAs' (IncRNAs) expression and expression of all other genes at the microarray. The negative logarithm (-log,o) p-values 
of the Pearson correlation were plotted across chromosomes. The Bonferoni-corrected significance level is indicated by the dashed line (p<0.05). a. 
ENSG00000249772.1 (Chromosome 5) has 144 statistically significantly correlated transcripts, of which 62.5% are other IncRNAs. b. 
ENSG00000230544.1 (Chromosome 13) has 178 statistically significantly correlated transcripts, of which 61% are other IncRNAs. c. 
ENSG00000215231.3 (Chromosome 5) has 158 statistically significantly correlated transcripts, of which 63% are other IncRNAs. For comparison, 
only 1 1% of the probes across the microarray map to IncRNAs. 
doi:1 0.1 371/journal.pone.0090272.g005 



tissue, demonstrating an HHT telangiectasia! profile. However, 
gene expression profiling did not allow us to discriminate between 
HHTl and HHT2, possibly because of the size of the study and 
lack of statistical power. Nonetheless, HHTl and HHT2 had 
somewhat different groups of difFerentiaUy expressed IncRNAs 
with a smaller number of common IncRNAs, which may be due to 
the minor phenotypical and pathway differences of HHTl and 
HHT2. 

The bicRNA list used in this study was retrieved from the 
GENCODE V. 16 dataset, which uses a combination of manual 



annotation, computational analysis and targeted experimental 
validation and is the largest catalogue of human IncRNAs to date. 
It was used in order to retrieve optimal IncRNA annotation. To 
minimize the number of probes mapping only or mostly to 
mRNAs, we filtered out probes mapping to overlapping mRNA 
exons and IncRNA introns. 

LncRNAs appear to control expression of protein-coding genes 
through both cis- and fraar-acting pathways. As dt-regulation is 
reported to be more pronounced for IncRNAs [11], we chose to 
use GREAT for further analysis of our data. GREAT is a tool that 
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can be used to relate the IncRNA transcriptome to biology using 
well-known mRNA function and gene ontology (GO) terms. Using 
GREAT, three interesting and statistically significant GO terms 
were the result: blood vessel morphogenesis, blood vessel 
development, and vasculogenesis. All three were located inside 
the top 6 resulting GO terms. Blood vessel morphogenesis is 
defined as: 'The process in which the anatomical structures of 
blood vessels are generated and organized. The blood vessel is the 
vasculature carr\'ing blood'. Blood vessel development is defined 
as: 'The process whose specific outcome is the progression of a 
blood vessel over time, from its formation to the mature structure'; 
and vasculogenesis: 'The differentiation of endothelial cells from 
progenitor cells during blood vessel development, and the de novo 
formation of blood vessels and tubes'. These GO term results are 
interesting as HHT manifestations are thought to result from 
imbalanced angiogenesis. The results may point to a causal effect 
of IncRNAs regulating mRNAs [:entral to vasculogenesis, angio- 
genesis, and perhaps even HHT telangiectasia formation. 

In most of the cases of multiple IncRNAs neighbouring a gene, a 
clear correlation of direction was observed. However, a correlation 
of direction was not present in all cases, which could indicate that 
a more fine-tuned regulation is being played out. Each GO term 
involved both up- and downregulated IncRNAs, which may 
suggest local cw-regulation at multiple sites. 

The three GO terms contain repeatedly mentioned genes. 
Accordingly, that indicates a central group of genes that seems to 
be CM-regulated by diSerentially expressed IncRNAs. This central 
group of genes primarily involves: CAVJ, CCM2, FOXFl, FZD4, 
PRSS23, RASAl, SMO, TIPARP, ZPPM2 and Z^IZl- Some of 
these are genes known to cause vascular malformation disorders, 
such as HHT: CCM2 causes cerebral cavernous malformation type 
2 [20] and RASAl causes capillary maJformation-arteriovenous 
malformation syndrome (CM-AVM) [21]. 

Another approach in the data analysis was the correlation study. 
The c()rr(;lations between the expression levels of the 10 highest 
ranking differentially expressed IncRNAs in the HHT total group 
and every transcript present in the Agilent microarray were 
assessed by calculating Pearson's correlation coefficient. Multiple 
statistically significant correlations, mapping to many different 
chromosomes, were present for each IncRNA in question. Hence, 
the correlation analyses may indicate that multiple fran^-regulative 
mechanisms are present. More than half the correlated transcripts 
are IncRNAs and, as these tend to be more tissue-specific than 
mRNAs, it could be that they are all co-regulated. Nevertheless, these 
results suggest only a certain degree of whole-genome correlation, 
and this may not be associated to the HHT phenotype at all. 

In 2007, two gene expression microarray studies were published 
describing HHT samples. Thomas et al. [22] studied cultured 
human umbilical vein endothelial cells from seven newborns 
carrying the familial HHT mutation, using human umbilical vein 
endothelial cells from three healthy newborns as controls. 
Fernandez-Lopez et al. [23] studied blood outgrowth endothelial 
cells from three HHT patients using blood outgrowth endothelial 
cells from healthy donors as controls. Neither of the two papers 
described non-coding RNA, and their raw data have not been 
made available for re-annotation and data analysis in order to 
compare with our results. 

The strength of our study is that affected HHT tissue was 
analysed in a paired design with samples from a relatively large 
number of patients. This resulted in a large number of difierentiy 
expressed IncRNAs, even though we applied a strict cut-off. 
GREAT is a useful tool to relate IncRNAs to known biology, but it 
is limited to c£y-regulation. Of the 617 aberrantiy expressed 
IncRNAs in the HHT group, only 63 were identified in relevant 



GO terms by GREAT, indicating that regulative mechanisms 
other than cir-regulation may be important. Future studies are 
needed to validate the results of our study and to contribute 
knowledge about specific IncRNA functionality and both cis- and 
frawi-regulatory mechanisms. 

In summary, our study identified IncRNAs that are aberrantiy 
expressed in HHT telangiectasia and indicates that IncRNAs may 
contribute to regulate protein-coding loci in HHT, suggesting that 
the IncRNA component of the transcriptome deserves more 
attention in HHT research. Thus, a deeper understanding of 
IncRNAs and their role in tclangi('ctasia formation possesses 
potential for discovering therapeutic targets and for identifying 
new biomarkers. 
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